
    Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation

    Recent advances in deep learning have made it possible to generate photo-realistic images with neural networks and even to extrapolate video frames from an input video clip. In this paper, both to further this exploration and out of our own interest in a realistic application, we study image-to-video translation, focusing on videos of facial expressions. Compared to image-to-image translation, this problem challenges deep neural networks with an additional temporal dimension. Moreover, its single input image defeats most existing video generation methods, which rely on recurrent models. We propose a user-controllable approach that generates video clips of various lengths from a single face image. The lengths and types of the expressions are controlled by users. To this end, we design a novel neural network architecture that can incorporate the user input into its skip connections, and we propose several improvements to the adversarial training method for the neural network. Experiments and user studies verify the effectiveness of our approach. In particular, we highlight that even for face images in the wild (downloaded from the Web and the authors' own photos), our model can generate high-quality facial expression videos, of which about 50% are labeled as real by Amazon Mechanical Turk workers. Comment: 10 page

    AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

    In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation. With a novel attentional generative network, the AttnGAN can synthesize fine-grained details at different subregions of the image by paying attention to the relevant words in the natural language description. In addition, a deep attentional multimodal similarity model is proposed to compute a fine-grained image-text matching loss for training the generator. The proposed AttnGAN significantly outperforms the previous state of the art, boosting the best reported inception score by 14.14% on the CUB dataset and 170.25% on the more challenging COCO dataset. A detailed analysis is also performed by visualizing the attention layers of the AttnGAN. It shows for the first time that the layered attentional GAN is able to automatically select the condition at the word level for generating different parts of the image.

    Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation

    We propose a hierarchically structured reinforcement learning approach to address the challenges of planning for generating coherent multi-sentence stories for the visual storytelling task. Within our framework, the task of generating a story given a sequence of images is divided across a two-level hierarchical decoder. The high-level decoder constructs a plan by generating a semantic concept (i.e., topic) for each image in sequence. The low-level decoder generates a sentence for each image using a semantic compositional network, which effectively grounds the sentence generation in the topic. The two decoders are jointly trained end-to-end using reinforcement learning. We evaluate our model on the visual storytelling (VIST) dataset. Empirical results from both automatic and human evaluations demonstrate that the proposed hierarchically structured reinforced training achieves significantly better performance than a strong flat deep reinforcement learning baseline. Comment: Accepted to AAAI 201

    Semileptonic $B$ Meson Decays Into A Highly Excited Charmed Meson Doublet

    We study the heavy quark effective theory prediction for semileptonic $B$ decays into an orbitally excited $F$-wave charmed doublet, the ($2^{+}$, $3^{+}$) states ($D^{*'}_{2}$, $D_{3}$), at the leading order of heavy quark expansion. The corresponding universal form factor is estimated by using the QCD sum rule method. The decay rates we predict are $\Gamma_{B\to D^{*'}_{2}\ell\overline{\nu}}=1.85\times10^{-19}\,\mathrm{GeV}$ and $\Gamma_{B\to D_{3}\ell\overline{\nu}}=1.78\times10^{-19}\,\mathrm{GeV}$. The branching ratios are $\mathcal{B}(B\to D_{2}^{*'}\ell\overline{\nu})=4.6\times10^{-7}$ and $\mathcal{B}(B\to D_{3}\ell\overline{\nu})=4.4\times10^{-7}$, respectively. Comment: 6 pages, 2 figure
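
As a rough consistency check on the numbers quoted in this abstract, a branching ratio follows from a partial width as $\mathcal{B} = \Gamma\,\tau_B/\hbar$. The lifetime below is an assumption, not a value taken from the abstract: it is the approximate charged $B$ meson lifetime, $\tau_B \approx 1.638\,\mathrm{ps}$, used with $\hbar \approx 6.582\times10^{-25}\,\mathrm{GeV\,s}$. The authors may have used a different input.

```latex
% Assumed inputs (not stated in the abstract):
%   \tau_B \approx 1.638 \times 10^{-12} s,  \hbar \approx 6.582 \times 10^{-25} GeV s
\[
\frac{\tau_B}{\hbar} \approx \frac{1.638\times10^{-12}\,\mathrm{s}}{6.582\times10^{-25}\,\mathrm{GeV\,s}}
\approx 2.49\times10^{12}\,\mathrm{GeV}^{-1}
\]
\[
\mathcal{B}(B\to D^{*'}_{2}\ell\overline{\nu}) \approx 1.85\times10^{-19}\,\mathrm{GeV}\times 2.49\times10^{12}\,\mathrm{GeV}^{-1} \approx 4.6\times10^{-7}
\]
\[
\mathcal{B}(B\to D_{3}\ell\overline{\nu}) \approx 1.78\times10^{-19}\,\mathrm{GeV}\times 2.49\times10^{12}\,\mathrm{GeV}^{-1} \approx 4.4\times10^{-7}
\]
```

Under that assumed lifetime, both results agree with the quoted branching ratios to the two significant figures given.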

    Distributed Multicell Beamforming Design Approaching Pareto Boundary with Max-Min Fairness

    This paper addresses coordinated downlink beamforming optimization in multicell time-division duplex (TDD) systems where a small number of parameters are exchanged between cells but with no data sharing. With the goal of reaching the point on the Pareto boundary with max-min rate fairness, we first develop a two-step centralized optimization algorithm to design the joint beamforming vectors. This algorithm can achieve a further sum-rate improvement over the max-min optimal performance, and is shown to guarantee max-min Pareto optimality for scenarios with two base stations (BSs) each serving a single user. To realize a distributed solution with limited intercell communication, we then propose an iterative algorithm that exploits an approximate uplink-downlink duality, in which only a small number of positive scalars are shared between cells in each iteration. Simulation results show that the proposed distributed solution achieves a fairness rate performance close to that of the centralized algorithm while having a better sum-rate performance, and demonstrates a better tradeoff between sum-rate and fairness than the Nash Bargaining solution, especially at high signal-to-noise ratio. Comment: 8 figures. To Appear in IEEE Trans. Wireless Communications, 201

    Unconventional Superconducting Symmetry in a Checkerboard Antiferromagnet

    We use a renormalized mean field theory to study the Gutzwiller projected BCS states of the extended Hubbard model in the large $U$ limit, or the $t$-$t'$-$J$-$J'$ model on a two-dimensional checkerboard lattice. At small $t'/t$, the frustration due to the diagonal terms of $t'$ and $J'$ does not alter the $d_{x^2-y^2}$-wave pairing symmetry, and the negative (positive) $t'/t$ enhances (suppresses) the pairing order parameter. At large $t'/t$, the ground state has an extended $s$-wave symmetry. At the intermediate $t'/t$, the ground state is $d+id$ or $d+is$-wave with time reversal symmetry broken. Comment: 6 pages, 6 figure

    Shallow Triple Stream Three-dimensional CNN (STSTNet) for Micro-expression Recognition

    In recent years, the state of the art in facial micro-expression recognition has been significantly advanced by deep neural networks. The robustness of deep learning has yielded promising performance beyond that of traditional handcrafted approaches. Most works in the literature emphasize increasing the depth of networks and employing highly complex objective functions to learn more features. In this paper, we design a Shallow Triple Stream Three-dimensional CNN (STSTNet) that is computationally light while remaining capable of extracting discriminative high-level features and details of micro-expressions. The network learns from three optical flow features (i.e., optical strain, horizontal and vertical optical flow fields) computed from the onset and apex frames of each video. Our experimental results demonstrate the effectiveness of the proposed STSTNet, which obtained an unweighted average recall rate of 0.7605 and an unweighted F1-score of 0.7353 on the composite database consisting of 442 samples from the SMIC, CASME II and SAMM databases. Comment: 5 pages, 1 figure, Accepted and published in IEEE FG 201
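
The two metrics reported in this abstract can be illustrated with a minimal sketch. It assumes the usual definitions: unweighted average recall (UAR) is the plain mean of per-class recall, and unweighted F1 (UF1) is the plain mean of per-class F1, so every class counts equally regardless of size. The function name `unweighted_scores`, the class labels, and the toy predictions are hypothetical; this is not the authors' evaluation code and does not reproduce their numbers.

```python
def unweighted_scores(y_true, y_pred):
    """Return (UAR, UF1): per-class recall and F1, averaged with equal class weight."""
    classes = sorted(set(y_true))
    pairs = list(zip(y_true, y_pred))
    recalls, f1s = [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        recalls.append(tp / (tp + fn))  # tp + fn > 0 because c occurs in y_true
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return sum(recalls) / len(classes), sum(f1s) / len(classes)


# Hypothetical toy labels (three classes of unequal size), for illustration only.
y_true = ["neg", "neg", "neg", "neg", "pos", "pos", "sur", "sur"]
y_pred = ["neg", "neg", "neg", "pos", "pos", "pos", "sur", "neg"]

uar, uf1 = unweighted_scores(y_true, y_pred)
print(f"UAR = {uar:.4f}, UF1 = {uf1:.4f}")
```

Because each class contributes equally, a classifier cannot score well on these metrics by favoring the largest class, which matters when a composite database has unbalanced classes.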

    Bulge formation from SSCs in a responding cuspy dark matter halo

    We simulate bulge formation in very late-type dwarf galaxies from circumnuclear super star clusters (SSCs) moving in a responding cuspy dark matter halo (DMH). The simulations show that (1) the response of the DMH to the sinking of SSCs is detectable only in the region interior to about 200 pc; the mean logarithmic slope of the responding DM density profile over that area displays two different phases, a very early descent followed by an ascent until it approaches 1.2 at an age of 2 Gyr; (2) the detectable feedback of the DMH response on bulge formation turns out to be very small, in the sense that the formed bulges and their paired nuclear cusps in the fixed and the responding DMH are basically the same, and both are consistent with HST observations; (3) the resulting mass correlation of bulges with their nuclear (stellar) cusps and the time evolution of the cusps' mass are in accordance with recent findings on relevant relations. In combination with the consistency of the effective radii of nuclear cusps with the observed quantities of nuclear clusters, we believe that the bulge formation scenario we proposed could be a very promising mechanism for forming nuclear clusters. Comment: 27 pages, 11 figures, accepted for publication in Ap